6 research outputs found

    Phonetic accommodation of human interlocutors in the context of human-computer interaction

    Phonetic accommodation refers to the phenomenon that interlocutors adapt their way of speaking to each other within an interaction. This can have a positive influence on the communication quality. As we increasingly use spoken language to interact with computers, phonetic accommodation is also investigated in the context of human-computer interaction: on the one hand, to find out whether speakers adapt to a computer agent in a similar way as they do to a human interlocutor; on the other hand, to implement accommodation behavior in spoken dialog systems and explore how this affects their users. To date, the focus has been mainly on the global acoustic-prosodic level. The present work demonstrates that speakers interacting with a computer agent also identify locally anchored phonetic phenomena, such as segmental allophonic variation and local prosodic features, as accommodation targets and converge on them. To this end, we conducted two experiments. First, we applied the shadowing method, in which the participants repeated short sentences from natural and synthetic model speakers. In the second experiment, we used the Wizard-of-Oz method, in which an intelligent spoken dialog system is simulated, to enable a dynamic exchange between the participants and a computer agent, the virtual language learning tutor Mirabella. The target language of our experiments was German. Phonetic convergence occurred in both experiments, with natural as well as synthetic voices as stimuli. Moreover, both native and non-native speakers of the target language converged to Mirabella. Thus, accommodation could be relevant, for example, in the context of computer-assisted language learning. Individual variation in accommodation behavior can be attributed in part to speaker-specific characteristics, one of which is assumed to be the personality structure. We included the Big Five personality traits as well as the concept of mental boundaries in the analysis of our data. Different personality traits influenced accommodation to different types of phonetic features. Mental boundaries have not been studied before in the context of phonetic accommodation. We created a validated German adaptation of a questionnaire that assesses the strength of mental boundaries. This adaptation can be used in future studies involving mental boundaries in native speakers of German.
    Funding: Deutsche Forschungsgemeinschaft (DFG), project number 278805297: "Phonetische Konvergenz in der Mensch-Maschine-Kommunikation" (Phonetic convergence in human-machine communication)

    Phonetic accommodation to natural and synthetic voices : Behavior of groups and individuals in speech shadowing

    The present study investigates whether native speakers of German phonetically accommodate to natural and synthetic voices in a shadowing experiment. We aim to determine whether this phenomenon, which is frequently found in human-human interaction (HHI), also occurs in human-computer interaction (HCI) involving synthetic speech. The examined features pertain to different phonetic domains: allophonic variation, schwa epenthesis, realization of pitch accents, word-based temporal structure and distribution of spectral energy. On the individual level, we found that the participants converged to varying subsets of the examined features, while they maintained their baseline behavior in other cases or, in rare instances, even diverged from the model voices. This shows that accommodation with respect to one particular feature may not predict the behavior with respect to another feature. On the group level, the participants of the natural condition converged to all features under examination, although only very subtly for schwa epenthesis. The synthetic voices, while partly reducing the strength of effects found for the natural voices, triggered accommodating behavior as well. The predominant pattern for all voice types was convergence during the interaction followed by divergence after the interaction.

    Phonetic accommodation in interaction with a virtual language learning tutor: A Wizard-of-Oz study

    We present a Wizard-of-Oz experiment examining phonetic accommodation of human interlocutors in the context of human-computer interaction. Forty-two native speakers of German engaged in dynamic spoken interaction with a simulated virtual tutor for learning the German language called Mirabella. Mirabella was controlled by the experimenter and used either natural or hidden Markov model-based synthetic speech to communicate with the participants. In the course of four tasks, the participants' accommodating behavior with respect to wh-question realization and allophonic variation in German was tested. The participants converged to Mirabella with respect to modified wh-question intonation, i.e., rising F0 contour and nuclear pitch accent on the interrogative pronoun, and the allophonic contrast [ɪç] vs. [ɪk] occurring in the word ending -ig. They did not accommodate to the allophonic contrast [ɛː] vs. [eː] as a realization of the long vowel -ä-. The results did not differ between the experimental groups that communicated with either the natural or the synthetic speech version of Mirabella. Testing the influence of the "Big Five" personality traits on the accommodating behavior revealed a tendency for neuroticism to influence the convergence of question intonation. On the level of individual speakers, we found considerable variation with respect to the degree and direction of accommodation. We conclude that phonetic accommodation on the level of local prosody and segmental pronunciation occurs in users of spoken dialog systems, which could be exploited in the context of computer-assisted language learning.

    The Partner Modelling Questionnaire: A validated self-report measure of perceptions toward machines as dialogue partners

    Recent work has looked to understand user perceptions of speech agent capabilities as dialogue partners (termed partner models), and how this affects user interaction. Yet, currently partner model effects are inferred from language production, as no metrics are available to quantify these subjective perceptions more directly. Through three studies, we develop and validate the Partner Modelling Questionnaire (PMQ): an 18-item self-report semantic differential scale designed to reliably measure people's partner models of non-embodied speech interfaces. Through principal component analysis and confirmatory factor analysis, we show that the PMQ scale consists of three factors: communicative competence and dependability, human-likeness in communication, and communicative flexibility. Our studies show that the measure consistently demonstrates good internal reliability, strong test-retest reliability over 12-week and 4-week intervals, and predictable convergent/divergent validity. Based on our findings we discuss the multidimensional nature of partner models, whilst identifying key future research avenues that the development of the PMQ facilitates. Notably, this includes the need to identify the activation, sensitivity, and dynamism of partner models in speech interface interaction. Comment: Submitted (TOCHI

    Audience design and egocentrism in reference production during human-computer dialogue

    Our current understanding of the mechanisms that underpin language production in human-computer dialogue (HCD) is sparse. What work there is in the field of human-computer interaction (HCI) supposes that people tend to adapt their language allocentrically, taking into account the perceived limitations of their partners, when talking to computers. Yet, debates in human-human dialogue (HHD) research suggest that people may also act egocentrically when producing language in dialogue. Our research aims to identify whether, similar to HHD, users also produce egocentric language within speech-based HCD interactions and how this behaviour compares to interaction with human dialogue partners. Such knowledge benefits the field of HCI by better understanding the mechanisms present in language production during HCD, which can be used to build more nuanced theories and models of user behaviour to inform research and design of speech interfaces. Through two controlled experiments using an adapted director-matcher task similar to those used in research on perspective-taking in psycholinguistics, we show that people do take the computer's perspective into account less (i.e. behave more egocentrically) during HCD than in HHD (Experiment 1). However, this egocentric effect is eliminated when computers are framed as separate interlocutors rather than computers integrated in the interactive system and where differences in perspective are made salient, leading to similar levels of perspective-taking as with human partners (Experiment 2). We discuss the findings, emphasising potential explanations for this effect, focusing on how egocentric and allocentric production processes may interact, along with the impact of partner roles and the division of labour in HCD as an underlying explanation for the effects seen.